Ayush Patel
There is stuff you want from web pages.
You know R.
Here you can learn how to use R to get stuff from web pages.
{rvest} functions
{rvest} and {purrr} functions
{RSelenium} for browser automation
HTML is a markup language. What is a markup language?
Structure of the webpage
Has Elements. What are elements?
Elements tell the browser how to display content
This is how HTML looks.
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
Below is an HTML Element.
<tag>stuff to be displayed on the webpage </tag>
Starts with <tag>
Ends with </tag>
Elements can be nested, meaning you can have one element within another
Examples can be headings, paragraphs, lists etc
There can be multiple elements of the same type in a web page. How would you identify different elements of the same type? Each element can be located by a CSS selector, an XPath, or both.
Elements of the same type, say a heading, can have different attributes.
Consider this example
<h3 style="color:blue;text-align:center">This is a header</h3>
<h3 style="color:red;text-align:right">This is a header</h3>
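To see how attributes help tell apart elements of the same type, here is a minimal sketch using {rvest} on the two headers above. The object names (two_headers, all_h3, h3_styles, red_h3) are made up for illustration, and minimal_html() is used only so that the example runs without an internet connection.

```r
library(rvest)

# Hypothetical stand-in page holding the two h3 elements from the example above
two_headers <- minimal_html('
  <h3 style="color:blue;text-align:center">This is a header</h3>
  <h3 style="color:red;text-align:right">This is a header</h3>
')

# Both elements are of the same type, so the selector "h3" matches both
all_h3 <- html_elements(two_headers, "h3")

# The style attribute is what differs between them
h3_styles <- html_attr(all_h3, "style")

# An attribute selector keeps only the header whose style contains "color:red"
red_h3 <- html_elements(two_headers, "h3[style*='color:red']")
```

The text of both headers is identical, so the attribute is the only thing that separates them here.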
HTML code
<p>
This is the random stuff that comes out of my brain. It has no meaning. How many tomatoes are there in ketchup bottle.
<ul>
<li>potatoes, but potatoes have eyes
<li>Stop this madness</li>
</li>
<li>tomatoes, do tomatoes have toes??</li>
</ul>
</p>
HTML output
This is the random stuff that comes out of my brain. It has no meaning. How many tomatoes are there in ketchup bottle.
The Family
Here we will try to scrape a webpage without really knowing the details of the functions.
Navigate to https://www.giantbomb.com/looney-tunes/3025-714/characters/ in your browser.
Take a good look at this website
Here is what we want:
Load libraries
Run this command; it returns the HTML code of the webpage
Use a CSS selector or the browser's Inspect Element tool to find the element that holds the character names
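The three steps above can be sketched roughly as follows, assuming the selector is plain h3, which is consistent with the output that follows. The helper name get_h3_nodes and the small demo_page are illustrative only; the call against the live site is left as a comment because it needs an internet connection.

```r
# Load libraries
library(rvest)

# Hypothetical helper: return every <h3> element of a parsed page
get_h3_nodes <- function(page) {
  html_elements(page, "h3")
}

# Small offline stand-in for the real page (content made up for illustration)
demo_page <- minimal_html('
  <h3 class="display-view">A brand of Warner Bros. Inc.</h3>
  <h3 class="title">Bugs Bunny</h3>
  <h3 class="title">Daffy Duck</h3>
')
demo_h3 <- get_h3_nodes(demo_page)

# With an internet connection, the same steps on the real page would be:
# looney_page <- read_html("https://www.giantbomb.com/looney-tunes/3025-714/characters/")
# get_h3_nodes(looney_page)
```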
The output
{xml_nodeset (37)}
[1] <h3 class="display-view">A brand of Warner Bros. Inc. It began as a cart ...
[2] <h3>\n \n \n \n\n<dt id="js-field-label--fran ...
[3] <h3 class="title">Barnyard Dawg</h3>
[4] <h3 class="title">Big Chungus</h3>
[5] <h3 class="title">Bugs Bunny</h3>
[6] <h3 class="title">Daffy Duck</h3>
[7] <h3 class="title">Dr. Moron</h3>
[8] <h3 class="title">Egghead Jr.</h3>
[9] <h3 class="title">Elmer Fudd</h3>
[10] <h3 class="title">Foghorn Leghorn</h3>
[11] <h3 class="title">Gossamer</h3>
[12] <h3 class="title">Hamton J. Pig</h3>
[13] <h3 class="title">Hector the Bulldog</h3>
[14] <h3 class="title">Henery Hawk</h3>
[15] <h3 class="title">Hippety Hopper</h3>
[16] <h3 class="title">Hugo the Abominable Snowman</h3>
[17] <h3 class="title">K-9</h3>
[18] <h3 class="title">Lola Bunny</h3>
[19] <h3 class="title">Marvin The Martian</h3>
[20] <h3 class="title">Melissa Duck</h3>
...
Add this to the code
[1] "A brand of Warner Bros. Inc. It began as a cartoon series which spawned a number of tie-ins including video games."
[2] "\n \n \n \n\nSummaryShort summary describing this franchise.\n "
[3] "Barnyard Dawg"
[4] "Big Chungus"
[5] "Bugs Bunny"
[6] "Daffy Duck"
[7] "Dr. Moron"
[8] "Egghead Jr."
[9] "Elmer Fudd"
[10] "Foghorn Leghorn"
[11] "Gossamer"
[12] "Hamton J. Pig"
[13] "Hector the Bulldog"
[14] "Henery Hawk"
[15] "Hippety Hopper"
[16] "Hugo the Abominable Snowman"
[17] "K-9"
[18] "Lola Bunny"
[19] "Marvin The Martian"
[20] "Melissa Duck"
[21] "Miss Prissy"
[22] "Nasty Canasta"
[23] "O'Mike"
[24] "O'Pat"
[25] "Penelope Pussycat"
[26] "Pepe Le Pew"
[27] "Petunia Pig"
[28] "Porky Pig"
[29] "Ralph Wolf"
[30] "Road Runner"
[31] "Sam Sheepdog"
[32] "Speedy Gonzales"
[33] "Top contributors to this wiki"
[34] "Pick a List"
[35] "Comment and Save"
[36] " Thanks, we're checking your submission.\n "
[37] ""
Try using h3.title instead. Can you guess what happened? Save the output as an object, say char_name.
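A sketch of why h3.title behaves differently, using a small made-up page rather than the live site. The selector h3 matches every h3 element, while h3.title matches only those whose class is title, which drops the unrelated headings. All object names here are illustrative.

```r
library(rvest)

# Made-up stand-in for the real page: two unrelated headings, two character names
demo_page <- minimal_html('
  <h3 class="display-view">A brand of Warner Bros. Inc.</h3>
  <h3 class="title">Bugs Bunny</h3>
  <h3 class="title">Daffy Duck</h3>
  <h3>Pick a List</h3>
')

# "h3" keeps every heading, wanted or not
all_headings <- demo_page |> html_elements("h3") |> html_text()

# "h3.title" keeps only the headings with class "title"
demo_char_name <- demo_page |> html_elements("h3.title") |> html_text()
```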
A problem you will face:
The p element is not used only for the description of the characters on this webpage. So, what to do?
Do you see a pattern?
# read the page once and reuse it for every xpath
looney_page <- read_html("https://www.giantbomb.com/looney-tunes/3025-714/characters/")
# function to get the description for one xpath
get_description <- function(xp){
  looney_page |>
    html_element(xpath = xp) |>
    html_text()
}
# generate all xpaths that you want
vec_all_xpaths <- paste0('//*[@id="wiki-3025-714-characters"]/ul[1]/li[', 1:30, ']/a/p')
# get description for all characters
char_description <- purrr::map_chr(.x = vec_all_xpaths, .f = get_description)
# create the data frame
tibble::tibble(
  character = char_name,
  description = char_description
)
read_html()
Required Input: The URL of the webpage as a string. It can also be other things, such as literal XML or HTML.
Output: html code. The class of the output is usually xml_document, xml_node
This function is used to get all the details (HTML code) of a webpage. This output can then be used to extract the desired parts.
read_html example
rvest::read_html("https://www.giantbomb.com/looney-tunes/3025-714/characters/") -> looney_page
looney_page
{html_document}
<html lang="en" class="no-js no-touch ">
[1] <head>\n<meta http-equiv="Content-Type" content="text/html; charset=UTF-8 ...
[2] <body id="default-body" class="body--legacy wiki_object col-2-template " ...
looney_page has the HTML code for the webpage from the exercise.
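read_html() also accepts literal HTML instead of a URL, which is handy for experimenting offline. A minimal sketch with a made-up snippet (the object names tiny_page, page_class and heading_text are illustrative):

```r
library(rvest)

# read_html() parses a literal HTML string just like a downloaded page
tiny_page <- read_html(
  "<html><body><h1>My First Heading</h1><p>My first paragraph.</p></body></html>"
)

# The result has the same classes as a page read from a URL
page_class <- class(tiny_page)

# It can be passed on to the other {rvest} functions
heading_text <- tiny_page |> html_element("h1") |> html_text()
```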
html_elements() or html_element()
Required Input (x): A document, node or nodes. Like looney_page
Required Input (css or xpath): Either the CSS selector or the xpath of the desired element(s). Like h3.title or //*[@id="wiki-3025-714-characters"]/ul[1]/li[1]/a/h3
Output: Finds the specified elements and returns them. The class of the output is usually xml_node or xml_nodeset.
html_elements() or html_element() example
{xml_nodeset (30)}
[1] <h3 class="title">Barnyard Dawg</h3>
[2] <h3 class="title">Big Chungus</h3>
[3] <h3 class="title">Bugs Bunny</h3>
[4] <h3 class="title">Daffy Duck</h3>
[5] <h3 class="title">Dr. Moron</h3>
[6] <h3 class="title">Egghead Jr.</h3>
[7] <h3 class="title">Elmer Fudd</h3>
[8] <h3 class="title">Foghorn Leghorn</h3>
[9] <h3 class="title">Gossamer</h3>
[10] <h3 class="title">Hamton J. Pig</h3>
[11] <h3 class="title">Hector the Bulldog</h3>
[12] <h3 class="title">Henery Hawk</h3>
[13] <h3 class="title">Hippety Hopper</h3>
[14] <h3 class="title">Hugo the Abominable Snowman</h3>
[15] <h3 class="title">K-9</h3>
[16] <h3 class="title">Lola Bunny</h3>
[17] <h3 class="title">Marvin The Martian</h3>
[18] <h3 class="title">Melissa Duck</h3>
[19] <h3 class="title">Miss Prissy</h3>
[20] <h3 class="title">Nasty Canasta</h3>
...
looney_chars has the characters from the webpage, but not in text or string format.
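The difference between the two functions, and between the css and xpath inputs, can be sketched on a small made-up page. The id "characters" and all object names are assumptions for illustration; they are not taken from the real site.

```r
library(rvest)

# Made-up stand-in page with two character headings
demo_page <- minimal_html('
  <ul id="characters">
    <li><h3 class="title">Bugs Bunny</h3></li>
    <li><h3 class="title">Daffy Duck</h3></li>
  </ul>
')

# html_elements() returns every match as an xml_nodeset
all_names <- html_elements(demo_page, "h3.title")

# html_element() returns only the first match as a single xml_node
first_name <- html_element(demo_page, "h3.title")

# The same kind of selection with an XPath instead of a CSS selector
second_name <- html_element(demo_page, xpath = '//*[@id="characters"]/li[2]/h3')
```

A CSS selector is usually shorter to write, while an XPath makes it easy to address one element by its position.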
html_text()
html_elements() gets what we want, just not how we want it.
This is where html_text() can help.
Required Input (x): A document, node or nodes. Like looney_chars
Other Inputs (trim) : Remove spaces from the beginning and end
Returns a character vector.
html_text() example
[1] "Barnyard Dawg" "Big Chungus"
[3] "Bugs Bunny" "Daffy Duck"
[5] "Dr. Moron" "Egghead Jr."
[7] "Elmer Fudd" "Foghorn Leghorn"
[9] "Gossamer" "Hamton J. Pig"
[11] "Hector the Bulldog" "Henery Hawk"
[13] "Hippety Hopper" "Hugo the Abominable Snowman"
[15] "K-9" "Lola Bunny"
[17] "Marvin The Martian" "Melissa Duck"
[19] "Miss Prissy" "Nasty Canasta"
[21] "O'Mike" "O'Pat"
[23] "Penelope Pussycat" "Pepe Le Pew"
[25] "Petunia Pig" "Porky Pig"
[27] "Ralph Wolf" "Road Runner"
[29] "Sam Sheepdog" "Speedy Gonzales"
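The effect of the trim input can be seen on a heading that carries extra whitespace. This is a small sketch on made-up HTML, not the live page, and the object names are illustrative.

```r
library(rvest)

# Made-up heading padded with line breaks and spaces
demo_page <- minimal_html('
  <h3 class="title">
    Bugs Bunny
  </h3>
')

node <- html_element(demo_page, "h3.title")

# Default: the whitespace around the text is kept
raw_text <- html_text(node)

# trim = TRUE removes spaces from the beginning and end
trimmed_text <- html_text(node, trim = TRUE)
```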
html_table()
Along with text, there are tables on webpages that we want.
Required Input (x): A document, node or nodes. It expects the output of either read_html or html_elements.
Other Inputs (header): If TRUE, the first row is used as the header; if set to NA, the first row is used as the header only if it consists of <th> tags.
Other Inputs (trim): Remove spaces from the beginning and end
Other Inputs (dec): Which character to use as the decimal mark. Some countries use , as the decimal mark.
Returns a tibble or list of tibbles if applied on multiple elements
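A minimal sketch of these inputs on a made-up table, assuming prices written with a comma as the decimal mark. The table content and object names are invented for illustration.

```r
library(rvest)

# Made-up table: the header row uses <th>, prices use "," as the decimal mark
demo_page <- minimal_html('
  <table>
    <tr><th>item</th><th>price</th></tr>
    <tr><td>tea</td><td>1,5</td></tr>
    <tr><td>coffee</td><td>2,25</td></tr>
  </table>
')

# Applied to a single table element, html_table() returns one tibble;
# dec = "," lets the price column be converted to numbers
price_table <- demo_page |>
  html_element("table") |>
  html_table(dec = ",")
```

Using html_elements("table") instead would return a list with one tibble per table found on the page.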
html_table() example
Go to the Department of Expenditure’s contact details page.
We want the tables on this page.
rvest::read_html("https://doe.gov.in/whos-who") |>
  rvest::html_elements("table") |>
  rvest::html_table()
[[1]]
# A tibble: 10 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <lgl> <chr> <chr> <chr> <chr> <chr> <chr>
1 Smt Nirmala Si… Financ… NA 230925… "23793… "" 134, N… "15, S… "app…
2 Shri S.S. Nakul Privat… NA 230925… "" "" 136-A,… "" ""
3 Shri Vivek Sin… OSD to… NA 230925… "" "5676,… 137A, … "" ""
4 Shri Ankit Jal… Addl. … NA 230925… "" "5676,… 142-A,… "" "fmo…
5 Shri B.N. Bhas… Addl. … NA 230925… "" "5676,… 137-A,… "" ""
6 Shri Karma Son… Addl. … NA 230925… "" "" 137-A,… "" ""
7 Shri Sernya Bh… 1st PA… NA 230925… "" "" 136, N… "" ""
8 Shri Anil Yadav Under … NA 230925… "" "5676,… 135, N… "" ""
9 Shri Ashok Raw… PPS NA 230925… "" "5676,… 142-A,… "" ""
10 Shri Ram Rasik… Under … NA 230925… "" "" 167-A,… "" ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[2]]
# A tibble: 10 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <chr> <chr> <lgl> <chr> <chr> <chr>
1 Shri Pankaj Ch… Minist… "" "23093… "" NA "138, … "" "mos…
2 Shri Kumar Rav… PS to … "" "23093… "" NA "North… "" ""
3 Shri Alkesh Ut… Addl. … "" "23093… "" NA "142, … "" ""
4 Shri Gaurav Sh… US "" "23093… "" NA "144-A… "" ""
5 Sh. Neeraj Mis… APS to… "MoS" "23093… "" NA "" "" ""
6 Sh. Dhruv Nara… Ist PA… "MoS" "23093… "" NA "" "" ""
7 MOS Finance MOS Fi… "" "" "" NA "" "" ""
8 Dr. Bhagwat Ka… MOS Fi… "" "23093… "011 -… NA "165, … "302, … "mos…
9 Shri Amit Meena PS to … "" "23093… "" NA "166A,… "" ""
10 Shri Shambhu K… Under … "" "23093… "" NA "164, … "" ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[3]]
# A tibble: 5 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <lgl> <int> <dbl> <chr> <chr> <chr> <chr>
1 Dr. T. V. Soman… F.S. &… NA 2.31e7 NA 5610, … 129-A,… "" "sec…
2 Shri S. Sudarsh… PSO NA 2.31e7 NA 5624, … 129-C … "" ""
3 Shri Rakesh Kum… PPS NA 2.31e7 9.96e9 5610, … 129-C … "A-1/2… ""
4 Sh. Chanakya Ke… PPS NA 2.31e7 9.87e9 5610, … 129-C,… "892, … ""
5 Sh. Netra Pal S… Consul… NA 2.31e7 9.97e9 5624, … 129-C,… "138-A… ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[4]]
# A tibble: 6 × 9
Name Desig…¹ Divis…² Telep…³ Teleph…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <lgl> <chr> <dbl> <int> <chr> <chr> <chr>
1 . Additi… NA "" NA NA "" "" ""
2 Shri Rajesh Dh… PPS to… NA "011-2… NA 5679 "166-E… "" "raj…
3 Mrs. Kavita Ma… PPS NA "011-2… NA 5679 "166E,… "EA-25… "m[d…
4 Shri M K Sahoo Adviser NA "011-2… NA NA "504, … "" "mks…
5 Sh. Rana Mukes… PA to … NA "011-2… NA NA "" "" "ran…
6 Shri Sushobhan… SSO NA "011-2… 9.20e11 NA "505 1… "" "s[d…
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[5]]
# A tibble: 21 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr>
1 Smt.Annie G. M… Specia… "Perso… 011-23… NA 5648 39-A, … "Bunga… "mat…
2 Smt. Sudha Raj… PSO to… "Perso… 011-23… NA 5648 36, No… "" "Sud…
3 Ms. Divya Alat… Direct… "Admin" 230926… NA 5693 168-C,… "" "div…
4 Shri Shyam Kis… Sr. PP… "Admin" 011-23… NA 5616 169-A,… "" "kis…
5 Shri S.N. Rana Under … "GAD/C… 011-23… NA 5665 56A, N… "" "ran…
6 Sh. Ranjit Kum… Under … "Admin… 011-23… NA 5695 225-E,… "" "ran…
7 Sh. K.J. Bhatt Under … "Admn." 230957… NA 5722 225E, … "" ""
8 Sh. Pijush Moh… US (Vi… "" 230956… 9.54e9 5656 231/NB "32-A,… "pij…
9 Sh. Ravi Kumar SO "GAD" 230956… NA 5621, … 56A, N… "" ""
10 Shri Rajeshwar… Deputy… "Offic… 011-23… NA 5620 261, N… "" "raj…
# … with 11 more rows, and abbreviated variable names ¹Designation,
# ²`Division/section`, ³`Telephone (Office)`, ⁴`Telephone(Residence)`,
# ⁵`Intercom No.`, ⁶`Room No.`, ⁷`Address(Residence)`
# ℹ Use `print(n = ...)` to see more rows
[[6]]
# A tibble: 36 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <chr> <lgl> <chr> <chr> <chr> <chr>
1 Smt..Annie G. … Specia… Person… 230932… NA "5648" 39-A, … "Bunga… "mat…
2 Smt. Sudha Raj… PSO to… Person… 011-23… NA "5648" 36 Nor… "" "Sud…
3 Sh. THANGLEMLI… Joint … E.Coor… 230932… NA "5690" 74 -B/… "" "THA…
4 Smt. Nirmala D… Direct… EG 230932… NA "5623" 37, NB "" "n[d…
5 Sh. Avinash K … Deputy… E.II -A 230926… NA "5609" 48 -E,… "" "avi…
6 Sh. Umesh Kuma… Deputy… E. III… 011-23… NA "5715" 225-D,… "" "uk[…
7 Sh. B. Sengupta D.S. E.III-… 230927… NA "5723" 76 -A/… "" ""
8 Sh. B.K. Manth… D.S. E.III-… 230945… NA "5669" 74-C/NB "" ""
9 Shri Ram Gopal Deputy… E.III B 230922… NA "5726" 30A, N… "" ""
10 Shri R.D. Talu… D.S EMC 246279… NA "" 502/LNB "" ""
# … with 26 more rows, and abbreviated variable names ¹Designation,
# ²`Division/section`, ³`Telephone (Office)`, ⁴`Telephone(Residence)`,
# ⁵`Intercom No.`, ⁶`Room No.`, ⁷`Address(Residence)`
# ℹ Use `print(n = ...)` to see more rows
[[7]]
# A tibble: 14 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <chr> <dbl> <int> <chr> <chr> <chr>
1 Smt.Annie G. M… Specia… "Perso… 011-23… NA 5648 "39-A,… "Bunga… "mat…
2 Smt. Sudha Raj… PSO to… "Perso… 011-23… NA 5648 "36, N… "" "Sud…
3 Sh. V. Padmana… D.S. "RTI &… 246177… NA NA "501/L… "" ""
4 Smt. Pratima G… DDG "Exp." 246537… NA NA "515,L… "" ""
5 Sh. Shiv Ram M… Direct… "SIU" 246975… NA NA "505/L… "" ""
6 Sh. Kailash Ch… Under … "SIU" 246110… NA NA "503/L… "" ""
7 Sh......... Under … "" 246186… NA NA "506/L… "" ""
8 Sh. Devinder K… Under … "RTI &… 246545… 8.08e9 NA "504,L… "" ""
9 Smt. Uma Aggar… SO "" 246189… NA NA "" "" ""
10 Smt. Kavita Sa… PS "" 246189… NA NA "508,L… "" ""
11 Sh. Lalit Kumar SO "" 246189… NA NA "508,L… "" ""
12 Sh. Raj Kumar SO "RTI" 246545… 8.45e9 NA "511, … "" ""
13 Sh.Santosh Kum… Careta… "" 246189… NA NA "508/L… "E-163… ""
14 Sh. Rajesh Sha… SO "SIU" 246545… 9.01e9 NA "511, … "" ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[8]]
# A tibble: 1 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <int> <lgl> <lgl> <chr> <lgl> <lgl>
1 Ms. Gurpreet Ka… SSO PRU 2.46e7 NA NA 511, L… NA NA
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[9]]
# A tibble: 13 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <chr> <lgl> <chr> <chr> <chr> <chr>
1 Shri Amit Sing… Joint … "PFC-I" 230948… NA 5652 142-B,… "" "neg…
2 Sh. L.K. Trive… Direct… "PFC.I" 230933… NA 5642 264 C,… "" "lk[…
3 Ms Swayamprava… Direct… "PFC.I" 230926… NA 5636 225-C,… "" "swa…
4 P. Parthiban Deputy… "PFC I" 011230… NA 5645 167 B,… "" "p[d…
5 Ms. Shalaka Ku… Dy. Di… "PFC.I" 230956… NA 5664 79, NB "" ""
6 CA Ranganath A… Deputy… "" 011-23… NA 5696 R No. … "Db001… "ran…
7 Sh. Partha Paul US "PFC.I" 230956… NA 5643 77,NB "" ""
8 Sh. Krishnakan… S.O-PF "PFC-I" 230956… NA 5622, … 65,NB "" ""
9 Sh. Mangal Pra… S.O. "PFC-I" 230956… NA 5651 77/NB "" ""
10 Rajesh Vermani S.O. "PFC-I" 230956… NA 5651 77,NB "" ""
11 Sh............… S.O. "PFC-I" 230956… NA 5622 65 "" ""
12 Ms. Subha Vija… S.O. "PFC-I" 230956… NA 5622 65/NB "" ""
13 Ms. Aruna Arora Asstt.… "PFC-I" 230956… NA 5622, … 65/NB "" ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[10]]
# A tibble: 12 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <int> <dbl> <chr> <chr> <chr> <chr>
1 Dr. Sajjan Sin… "Addit… "PF St… 2.31e7 NA "5673" 169-C,… "" ""
2 Sh Ravinder Ku… "Sr. P… "PF St… 2.31e7 NA "5682" 143, N… "." ""
3 . "PPS t… "PF St… 2.31e7 NA "5682" 143, NB "" ""
4 Sh. Prateek Ku… "Direc… "PFC.I" 2.31e7 NA "5660" 76, No… "" "pra…
5 Sh Deependra K… "Direc… "PF St… 2.31e7 9.45e9 "5612" 145,NB "" "kum…
6 Shri G. S. Ana… "Direc… "PF-St… 2.31e7 NA "5691" 162,NB "" "gsa…
7 Ms. Anjali Mau… "Assis… "PF St… 2.31e7 NA "5697" 80, No… "" "mau…
8 Sh. Rabi Ranjan "Deput… "PFC.I" 2.31e7 NA "5672" 264,NB "" ""
9 Vacant "Asstt… "PF St… NA NA "" 79,Nor… "" ""
10 Shri Sumit Aga… "Deput… "PFS" 2.31e7 NA "5700" 79, NB "" "agr…
11 PF (State) "" "" 2.31e7 NA "5625,… 80,NB "" ""
12 Sh. "AAO" "PF-St… 2.31e7 NA "5726" 30 A "" ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[11]]
# A tibble: 7 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <int> <lgl> <int> <chr> <lgl> <chr>
1 Shri Sanjay Pra… Addl. … "PF Ce… 2.31e7 NA 5720 161,NB NA "js[…
2 Sh. Tilak Raj G… PPS "" 2.31e7 NA 5698 163,NB NA ""
3 Mr. Amit Kumar Jt. Dir "PFC -… 2.31e7 NA 5688 162, NB NA ""
4 Ms. Hema Jaiswal Dir. "" 2.31e7 NA 5614 167-B,… NA ""
5 Sh. Puspendra S… Dy. Di… "PFC.I… 2.31e7 NA 5640 80, NB NA "pus…
6 Sh. Rangin Murmu Dy. Di… "PFC-I… 2.31e7 NA 5701 79, NB NA ""
7 Sh. Aayush Bans… Deputy… "PFC-I… 2.31e7 NA 5644 79, No… NA "abc…
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[12]]
# A tibble: 5 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <int> <dbl> <int> <chr> <chr> <chr>
1 Shri Manoj Sahay Additi… "" 2.31e7 9.97e9 5685 166-C,… "C II/… "man…
2 Sh. Surendra Ku… PPS to… "" 2.31e7 NA 5603 163,NB "" ""
3 Ms. A. Seetha M… PPS to… "Fin/M… 2.31e7 NA 5603 163, NB "" ""
4 Sh. Deepak Math… Dy. Se… "Rev./… 2.31e7 NA 5401 71-A,/… "" ""
5 Shri Nitin Kumar S.O. "MD" 2.31e7 NA 5714 276-C,… "" ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[13]]
# A tibble: 6 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <int> <dbl> <int> <chr> <chr> <chr>
1 Shri Manoj Sahay Additi… "" 2.31e7 9.97e9 5685 166-C,… "C II/… "man…
2 Sh. Surendra Ku… PPS to… "" 2.31e7 NA 5603 163,NB "" ""
3 Ms. A. Seetha M… PPS to… "Fin/M… 2.31e7 NA 5603 163, NB "" ""
4 Sh. Deepak Math… Dy. Se… "Rev./… 2.31e7 NA 5401 71-A,NB "" ""
5 Sh.Arvind Kumar… US "IFU" 2.31e7 NA 5607 225-E/… "" ""
6 Ms. Preeti Shar… S.O "IFU" 2.31e7 NA 5662 241-D,… "" ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[14]]
# A tibble: 2 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <int> <lgl> <lgl> <int> <chr> <lgl>
1 Sh. Subrata Cha… US MC 2.46e7 NA NA 139 2nd Fl… NA
2 Sh. Vishwa Nath… SO MC 2.46e7 NA NA NA 2nd Fl… NA
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[15]]
# A tibble: 6 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <int> <lgl> <int> <chr> <lgl> <lgl>
1 Dr. Sajjan Sing… Additi… "FCD" 2.31e7 NA 5673 169-C NA NA
2 Sh. Ravinder Ku… Sr. PP… "FCD" 2.31e7 NA 5682 143,NB NA NA
3 Mrs. Poonam Chh… PPS to… "FCD" 2.31e7 NA 5682 166-E,… NA NA
4 Shri Abhay Kumar Direct… "FCD" 2.44e7 NA NA .503,C… NA NA
5 Sh. Rajendra Ku… PPS to… "" 2.44e7 NA NA 502 NA NA
6 Shri Mahesh Kum… Deputy… "" 2.44e7 NA NA 508 NA NA
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[16]]
# A tibble: 5 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <int> <lgl> <int> <chr> <lgl> <chr>
1 Shri Sanjay Agg… "Advis… PPD 2.31e7 NA 5608 168-B,… NA "san…
2 Ms. Manju Kumari "PPS" PPD 2.31e7 NA 5708 169-A,… NA ""
3 Shri Kanwalpreet "Direc… PPD 2.31e7 NA 5683 264C/NB NA ""
4 US PPD "" PPD 2.46e7 NA NA 512, L… NA ""
5 Sh.. "SO" PPD 2.46e7 NA NA LNB NA ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[17]]
# A tibble: 18 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <chr> <lgl> <int> <chr> <lgl> <chr>
1 Shri Umesh Kum… Chief … Cost "24698… NA 8522 "201" NA "uks…
2 Sh. A.K. Gaur Sr. PPS Cost "24698… NA NA "202" NA ""
3 Sh. Gurubaksh … PS Cost "24698… NA 8522 "202,L… NA ""
4 Shri Amardeep … Adviser Cost "24618… NA 8906 "204-B… NA "ama…
5 Shri Pankaj Gu… Adviser Admn. "24698… NA 8435 "205-B" NA "gup…
6 Shri Rajesh Ya… Direct… Cost "24694… NA 4021 "209-B" NA ""
7 Shri Manoj Kum… Joint … Cost "24617… NA 7075 "205-C" NA "man…
8 Shri T.R.Sathi… Joint … Cost "24698… NA 8640 "205-A" NA "sch…
9 Shri Prakash H… Dy. Di… Cost "24653… NA 3487 "206, … NA "pra…
10 Ms. Priyanka S… Dy. Di… Cost "" NA NA "" NA "pri…
11 Shri Deepak Ga… Dy. Di… Cost "24653… NA 3487 "206, … NA "dee…
12 Shri Devanshi … Dy. Di… Cost "24653… NA 3487 "206, … NA "dev…
13 Ms. R Kalyanas… Asstt.… Cost "24692… NA 2541 "206, … NA "cma…
14 Shri Rahul Cha… Dy. Di… Cost "24692… NA 2541 "206, … NA "rah…
15 Shri Manoj Kum… Dy, Di… Cost "24653… NA 3487 "206,L… NA "ca[…
16 Shri Harsh Jos… Asstt.… Admn. "24692… NA 2541 "206, … NA "har…
17 Shri Pankaj Pa… Asstt.… Cost "24692… NA 2541 "206, … NA "pan…
18 Shri Mahesh Ch… Sectio… Admn. "24693… NA 3895 "208,L… NA "sha…
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[18]]
# A tibble: 5 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <lgl> <int> <lgl> <int> <chr> <lgl> <chr>
1 Sh. Alok Ranjan Chief … NA 2.31e7 NA 5729 240-B,… NA "cca…
2 Shri Vivekanand Contro… NA 2.31e7 NA 5730 241-A,… NA "ana…
3 Sh. Vikas Chand… DCA Fi… NA NA NA NA 401, E… NA "vc[…
4 Shri Himanshu S… ACA Fi… NA 2.31e8 NA NA 269, N… NA "sri…
5 Sh. Vikash Chan… Dy. Co… NA 2.31e7 NA NA R. No.… NA ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[19]]
# A tibble: 19 × 9
Name Desig…¹ Divis…² Telep…³ Telep…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <lgl>
1 Reception offi… "" "" 230956… NA "5617,… "" "" NA
2 Reception offi… "" "" 230956… NA "5605,… "" "" NA
3 Finance Canteen "" "" 230956… NA "5677,… "" "" NA
4 Tea Board "" "" 230956… NA "5670" "" "" NA
5 Coffee Board "" "" 230939… NA "5692" "" "" NA
6 Bikano "" "" 230956… NA "5601" "" "" NA
7 Driver Room "" "" 230956… NA "5675" "" "" NA
8 Electrical Enq… "" "" 230948… NA "" "" "" NA
9 Asstt. Enginee… "" "" 230923… NA "" "" "" NA
10 Sh. Vishal "(LG R… "" 987333… NA "" "" "" NA
11 Sh. M Krishnan "Jr. E… "AC/El… 230939… NA "" "" "96435… NA
12 Sh. D.C. Sharma "Asst.… "" 230939… 9.87e9 "" "" "" NA
13 Sh. Manish Kum… "Execu… "" 230923… 9.87e9 "" "" "" NA
14 CPWD (Civil) "" "" 230920… NA "" "" "" NA
15 CPWD (Electric… "" "" 230935… NA "" "" "" NA
16 Fire Station, … "" "" 230927… NA "" "" "" NA
17 JTO, MTNL, NB … "" "" 230920… NA "" "" "" NA
18 Internet Compl… "" "NIC- … 180011… NA "" "35AB" "" NA
19 Stationery Sto… "" "" 230956… NA "5658" "" "" NA
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
[[20]]
# A tibble: 3 × 9
Name Desig…¹ Divis…² Telep…³ Teleph…⁴ Inter…⁵ Room …⁶ Addre…⁷ Email
<chr> <chr> <chr> <chr> <dbl> <int> <chr> <chr> <chr>
1 Sh. Rajesh Mal… Princi… "Media… 011-23… 9.87e 9 5006 B-77, … "Flat … "dpr…
2 Ms. Gurmeet Bh… Sr. PPS "Media… 011-23… NA 5637 A-76, … "" "gur…
3 Sh. Kush Mohan… M&CO "" 230939… 1.00e10 5637 A-76 "" ""
# … with abbreviated variable names ¹Designation, ²`Division/section`,
# ³`Telephone (Office)`, ⁴`Telephone(Residence)`, ⁵`Intercom No.`,
# ⁶`Room No.`, ⁷`Address(Residence)`
What we have covered is a sufficient building block for a working knowledge of scraping webpages.